Tolerating Arbitrary Failures With State Machine Replication

Authors

ASSIA DOUDOU

BENOÎT GARBINATO

RACHID GUERRAOUI

Abstract

The growing reliance in our daily lives on services provided by distributed applications (e.g., air-traffic control, public switched telephone networks, and electronic commerce) leaves us vulnerable to the failures of these services. The challenge of fault tolerance is to provide services that survive failures. Designing and verifying fault-tolerant distributed applications is, however, viewed as a difficult task. In recent years, several paradigms have been identified that simplify this task. Key among them is state machine replication [12, 15, 19]. The underlying idea is intuitively simple: every crucial service that needs to be made fault tolerant is replicated on several computers that are assumed to fail independently. The presence of several replicas ensures high availability of the service. To preserve the consistency of the service, invocations of its replicas, even those coming from different clients, are handled so that they reach all replicas in the same order. The abstraction that provides this guarantee is called the total order broadcast primitive. Roughly speaking, this communication primitive ensures that messages broadcast within a group of processes are delivered in the same order, despite concurrency and failures.
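The ordering guarantee described above can be illustrated with a minimal Python sketch. It is not the protocol of the paper: it assumes a single, never-failing sequencer that assigns sequence numbers, and it tolerates no failures at all, arbitrary or otherwise. The names `Replica`, `Sequencer`, and `run_demo` are hypothetical and chosen only for illustration. The sketch shows why delivering commands in one agreed order keeps deterministic replicas consistent even when the network reorders messages.

```python
from dataclasses import dataclass, field


@dataclass
class Replica:
    """Deterministic key-value state machine that applies commands in sequence order."""
    state: dict = field(default_factory=dict)
    next_seq: int = 0
    pending: dict = field(default_factory=dict)  # seq -> command that arrived early

    def receive(self, seq, command):
        # Buffer the command, then deliver every command that is next in order.
        self.pending[seq] = command
        while self.next_seq in self.pending:
            key, value = self.pending.pop(self.next_seq)
            self.state[key] = value
            self.next_seq += 1


class Sequencer:
    """Hypothetical fixed sequencer (an assumption): one global number per command."""

    def __init__(self):
        self.counter = 0

    def order(self, command):
        seq = self.counter
        self.counter += 1
        return seq, command


def run_demo():
    sequencer = Sequencer()
    # Two clients write the same key concurrently; the sequencer fixes one order.
    ordered = [
        sequencer.order(("x", "from client A")),
        sequencer.order(("x", "from client B")),
        sequencer.order(("y", "from client A")),
    ]
    r1, r2 = Replica(), Replica()
    for seq, cmd in ordered:            # r1 receives the messages in order
        r1.receive(seq, cmd)
    for seq, cmd in reversed(ordered):  # r2 receives them in reverse order
        r2.receive(seq, cmd)
    return r1.state, r2.state
```

Calling `run_demo()` returns the two replica states, which are equal because both replicas apply the same commands in the same sequence. A real total order broadcast would replace the single sequencer with an agreement protocol among the replicas, since a lone sequencer is itself a single point of failure.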

Similar resources

Contributions to Building Efficient and Robust State-Machine Replication Protocols

State machine replication (SMR) is a software technique for tolerating failures using commodity hardware. The critical service to be made fault-tolerant is modeled by a state machine. Several, possibly different, copies of the state machine are then deployed on different nodes. Clients of the service access the replicas through an SMR protocol which ensures that, despite concurrency and failures...

Practical Hardening of Crash-Tolerant Systems

Recent failures of production systems have highlighted the importance of tolerating faults beyond crashes. The industry has so far addressed this problem by hardening crash-tolerant systems with ad hoc error detection checks, potentially overlooking critical fault scenarios. We propose a generic and principled hardening technique for Arbitrary State Corruption (ASC) faults, which specifically m...

Paxos Replicated State Machines as the Basis of a High-Performance Data Store

Conventional wisdom holds that Paxos is too expensive to use for high-volume, high-throughput, data-intensive applications. Consequently, fault-tolerant storage systems typically rely on special hardware, semantics weaker than sequential consistency, a limited update interface (such as append-only), primary-backup replication schemes that serialize all reads through the primary, clock synchroni...

From Viewstamped Replication to Byzantine Fault Tolerance

The paper provides an historical perspective about two replication protocols, each of which was intended for practical deployment. The first is Viewstamped Replication, which was developed in the 1980’s and allows a group of replicas to continue to provide service in spite of a certain number of crashes among them. The second is an extension of Viewstamped Replication that allows the group to s...

Responsive Security for Stored Data

We present the design of a distributed store that offers various levels of security guarantees while tolerating a limited number of nodes that are compromised by an adversary. The store uses secret sharing schemes to offer security guarantees namely availability, confidentiality and integrity. However, a pure secret sharing scheme could suffer from performance problems and high access costs. We...

Publication date: 2005
